Topic 8: Exploiting Time Variation
London School of Economics and Political Science
February 9, 2026
Following the same units over time enables new identification and estimation strategies.
\[\Delta y_{it} = \beta \Delta x_{it} + \Delta u_{it}\]
\[y_{it} = \alpha + \beta x_{it} + \sum_{j=2}^{N} \gamma_j \mathbb{1}[i=j] + u_{it}\]
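A minimal sketch of how the two equations above relate, assuming simulated data with \(T = 2\) and NumPy (all parameter values and variable names are hypothetical). It compares the first-difference slope with the LSDV slope (intercept plus \(N-1\) unit dummies) and with pooled OLS, which ignores the unit effects.

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta = 500, 1.5                       # hypothetical number of units and true slope

# Unit effects a_i are correlated with x, so pooled OLS is biased.
a = rng.normal(size=N)
x = a[:, None] + rng.normal(size=(N, 2))            # columns are periods 1 and 2
y = beta * x + a[:, None] + rng.normal(size=(N, 2))

# First differences: regress dy on dx without an intercept.
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
beta_fd = (dx @ dy) / (dx @ dx)

# LSDV: intercept, x, and N-1 unit dummies (rows stacked unit by unit).
unit = np.repeat(np.arange(N), 2)
dummies = (unit[:, None] == np.arange(1, N)[None, :]).astype(float)
X_lsdv = np.column_stack([np.ones(2 * N), x.ravel(), dummies])
beta_lsdv = np.linalg.lstsq(X_lsdv, y.ravel(), rcond=None)[0][1]

# Pooled OLS: intercept and x only.
X_pooled = np.column_stack([np.ones(2 * N), x.ravel()])
beta_pooled = np.linalg.lstsq(X_pooled, y.ravel(), rcond=None)[0][1]

print(f"FD: {beta_fd:.3f}  LSDV: {beta_lsdv:.3f}  pooled OLS: {beta_pooled:.3f}")
```

With two periods and no time effects, the FD and LSDV slopes coincide up to floating-point error. With more periods the two estimators generally differ.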
\[ \begin{aligned} \log(\text{fare})_{it} &= \beta_0 + \beta_1\log(\text{distance})_i + \beta_2\text{competition}_{it} + e_{it} \\ e_{it} &= \gamma_i + \delta_t + u_{it} \end{aligned} \]
\[\begin{align*} \widehat{\log(\text{fare})}_{it} &= \hat{\beta}_{0} + \hat{\beta}_{1}\log(\text{distance})_{i} \\ &+ \hat{\beta}_{2}\text{competition}_{it} \\ &+ \hat{\delta}_{1}\mathbb{1}[t=2007] \\ &+ \hat{\delta}_{2}\mathbb{1}[t=2012] \end{align*}\]
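Since \(\text{Cov}(\text{competition}_{it}, \gamma_i) \neq 0\) and the route effect \(\gamma_i\) sits in the error term \(e_{it}\), pooled OLS gives a biased estimate of \(\beta_2\).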
\[ \Delta\log(\text{fare})_{it} = \beta_2\Delta\text{competition}_{it} + \Delta\delta_t + \Delta u_{it} \]
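\(\gamma_i - \gamma_i = 0\): time-invariant route characteristics disappear. This includes \(\log(\text{distance})_i\), so \(\beta_1\) cannot be estimated from the differenced equation.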
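The subtraction behind the differenced equation can be written out as a short worked step. The index \(t-1\) is an assumption of notation here, denoting the previous survey wave:

\[ \begin{aligned} \log(\text{fare})_{it} &= \beta_0 + \beta_1\log(\text{distance})_i + \beta_2\text{competition}_{it} + \gamma_i + \delta_t + u_{it} \\ \log(\text{fare})_{i,t-1} &= \beta_0 + \beta_1\log(\text{distance})_i + \beta_2\text{competition}_{i,t-1} + \gamma_i + \delta_{t-1} + u_{i,t-1} \end{aligned} \]

Subtracting the second line from the first cancels \(\beta_0\), \(\beta_1\log(\text{distance})_i\) and \(\gamma_i\), leaving \(\beta_2\Delta\text{competition}_{it}\), the change in the period effect \(\Delta\delta_t = \delta_t - \delta_{t-1}\), and \(\Delta u_{it} = u_{it} - u_{i,t-1}\).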
With year dummies for transition periods (2002–2007 base, 2007–2012):
\[ \widehat{\Delta\log(\text{fare})}_{it} = \hat\alpha + \hat\beta_2\Delta\text{competition}_{it} + \hat\delta\mathbb{1}[\text{transition 2007--2012}] \]
(1) FD + year dummies, or (2) full LSDV with dummies for both units and time periods.
\[\begin{align*} \log(\text{patents})_{it} &= \beta_0 + \beta_1\log(\text{R\&D})_{it} \\ &+ \beta_2\mathbb{1}[t=2006] + \beta_3\mathbb{1}[t=2007] \\ &+ e_{it} \end{align*}\]
\[\begin{align*} \mathbb{E}[\log(\text{patents})_{it} \mid t=2005] &= \beta_0 + \beta_1\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2006] &= (\beta_0 + \beta_2) + \beta_1\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2007] &= (\beta_0 + \beta_3) + \beta_1\log(\text{R\&D})_{it} \end{align*}\]
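Because the outcome is in logs, the year dummies measure approximate proportional changes in patents relative to the 2005 base year at a given level of R&D, not differences in the number of patents.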
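A short worked step may make the interpretation concrete. The value \(\beta_2 = 0.05\) is purely hypothetical. Holding \(\log(\text{R\&D})_{it}\) fixed,

\[ \beta_2 = \mathbb{E}[\log(\text{patents})_{it} \mid t=2006] - \mathbb{E}[\log(\text{patents})_{it} \mid t=2005] = 0.05, \qquad e^{0.05} - 1 \approx 0.051 \]

so patents in 2006 are roughly 5% higher than in 2005 at the same R&D. Both dummies are measured against the 2005 base year, so the change from 2006 to 2007 is \(\beta_3 - \beta_2\), not \(\beta_3\).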
\[ \begin{align*} \log(\text{patents})_{it} &= \beta_0 + \beta_1\log(\text{R\&D})_{it} + \beta_2\mathbb{1}[t=2006] + \beta_3\mathbb{1}[t=2007] \\ &+ \beta_4(\log(\text{R\&D})_{it} \times \mathbb{1}[t=2006]) + \beta_5(\log(\text{R\&D})_{it} \times \mathbb{1}[t=2007]) \\ &+ e_{it} \end{align*} \]
\[\begin{align*} \mathbb{E}[\log(\text{patents})_{it} \mid t=2005] &= \beta_0 + \beta_1\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2006] &= (\beta_0 + \beta_2) + (\beta_1 + \beta_4)\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2007] &= (\beta_0 + \beta_3) + (\beta_1 + \beta_5)\log(\text{R\&D})_{it} \end{align*}\]
| Year | Intercept | Elasticity |
|---|---|---|
| 2005 | \(\beta_0\) | \(\beta_1\) |
| 2006 | \(\beta_0 + \beta_2\) | \(\beta_1 + \beta_4\) |
| 2007 | \(\beta_0 + \beta_3\) | \(\beta_1 + \beta_5\) |
No need to interpret \(\beta_4\) and \(\beta_5\) individually; the conditional means do the work.
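A minimal sketch of the table above, assuming simulated data and NumPy (the sample size, coefficients and variable names are hypothetical). It fits the fully interacted model and checks that the implied year-specific intercepts and elasticities match separate year-by-year regressions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300                                              # hypothetical firms per year
years = np.repeat([2005, 2006, 2007], n)
log_rd = rng.normal(size=3 * n)

# Hypothetical year-specific intercepts and elasticities.
const = np.where(years == 2005, 1.0, np.where(years == 2006, 1.1, 1.3))
elast = np.where(years == 2005, 0.5, np.where(years == 2006, 0.6, 0.8))
log_pat = const + elast * log_rd + rng.normal(scale=0.3, size=3 * n)

# Fully interacted model: year dummies and year-by-R&D interactions.
d06 = (years == 2006).astype(float)
d07 = (years == 2007).astype(float)
X = np.column_stack([np.ones(3 * n), log_rd, d06, d07, log_rd * d06, log_rd * d07])
b = np.linalg.lstsq(X, log_pat, rcond=None)[0]

intercept = {2005: b[0], 2006: b[0] + b[2], 2007: b[0] + b[3]}
elasticity = {2005: b[1], 2006: b[1] + b[4], 2007: b[1] + b[5]}

# Separate regression for each year; np.polyfit returns [slope, intercept].
separate = {yr: np.polyfit(log_rd[years == yr], log_pat[years == yr], 1)
            for yr in (2005, 2006, 2007)}

for yr in (2005, 2006, 2007):
    print(yr, round(intercept[yr], 3), round(elasticity[yr], 3))
```

The pooled interacted regression reproduces the year-by-year coefficients exactly. Its practical advantage is that differences across years can be tested within one model.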
\[ \begin{align*} \log(\text{wages})_{it} &= \beta_0 + \beta_1\mathbb{1}[i\text{ is male}] + \beta_2 t + \beta_3\mathbb{1}[t \geq 2005] \\ &+ \beta_4(\mathbb{1}[i\text{ is male}] \times t) + \beta_5(\mathbb{1}[i\text{ is male}] \times \mathbb{1}[t \geq 2005]) \\ &+ \beta_6(t \times \mathbb{1}[t \geq 2005]) + \beta_7(t \times \mathbb{1}[t \geq 2005] \times \mathbb{1}[i\text{ is male}]) \\ &+ e_{it} \end{align*} \]
This model captures level differences, trends, and how both changed after 2005, separately for men and women.
\[ \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=0,t<2005] = \beta_0 + \beta_2 t \]
\[ \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=1,t<2005] = (\beta_0 + \beta_1) + (\beta_2 + \beta_4)t \]
\(\beta_1\) shifts the intercept; \(\beta_4\) shifts the slope.
\[ \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=0,\; t \geq 2005] = (\beta_0 + \beta_3) + (\beta_2 + \beta_6)t\]
\[\begin{align*} \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=1,\; t \geq 2005] &= (\beta_0 + \beta_1 + \beta_3 + \beta_5) \\ &\quad + (\beta_2 + \beta_4 + \beta_6 + \beta_7)t \end{align*}\]
Each coefficient modifies either the intercept or slope for a specific group-period combination.
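The bookkeeping in the four conditional expectations can be sketched as a small helper. The function name `wage_profile` and the placeholder coefficients are hypothetical; setting `beta[k] = k` simply makes it easy to trace which coefficients enter each cell.

```python
def wage_profile(beta, male, post):
    """Return (intercept, slope in t) for one group-period cell.

    beta holds (beta_0, ..., beta_7) in the order of the wage equation;
    male and post are 0/1 indicators.
    """
    m, p = int(male), int(post)
    intercept = beta[0] + beta[1] * m + beta[3] * p + beta[5] * m * p
    slope = beta[2] + beta[4] * m + beta[6] * p + beta[7] * m * p
    return intercept, slope


# Placeholder coefficients: beta_k = k, purely for tracing.
beta = [0, 1, 2, 3, 4, 5, 6, 7]
profiles = {(m, p): wage_profile(beta, m, p) for m in (0, 1) for p in (0, 1)}

for (m, p), (a, s) in profiles.items():
    group = "men" if m else "women"
    period = "t >= 2005" if p else "t < 2005"
    print(f"{group:5s} {period:9s} intercept={a} slope={s}")
```

For men after 2005 the helper sums \(\beta_0 + \beta_1 + \beta_3 + \beta_5\) for the intercept and \(\beta_2 + \beta_4 + \beta_6 + \beta_7\) for the slope, matching the last conditional expectation.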
\[ e_{it} = \alpha_i + v_{it} \]
\[ \Delta y_i = \delta + \beta_1\Delta x_{i1} + \cdots + \beta_k\Delta x_{ik} + \Delta v_i \]
Example: cannot estimate returns to education via FD if education does not change over time.
Less scope for OVB, but not zero.
\[ \text{education}_{it} = \text{education}^{*}_{it} + e_{it} \]
\[ \log(\text{wage})_i = \alpha + \beta\text{education}^{*}_i + \epsilon_i \]
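The estimated model replaces the unobserved \(\text{education}^{*}_i\) with the observed \(\text{education}_i = \text{education}^{*}_i + e_i\), so the measurement error \(e_i\) ends up in the regression error.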
\[ \text{plim}\;\hat\beta = \beta \cdot \frac{\text{Var}(\text{educ}^*)}{\text{Var}(\text{educ}^*) + \text{Var}(e)} \]
The ratio is less than 1, so the coefficient is biased towards zero (derived in Topic 6).
\[ \Delta\text{education}_i = \Delta\text{education}^{*}_{i} + (e_{i2} - e_{i1}) \]
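\[ \text{Var}(e_{i2} - e_{i1}) = \text{Var}(e_{i1}) + \text{Var}(e_{i2}) \]
This holds when the measurement errors are uncorrelated across the two periods, so differencing adds up the noise variances instead of cancelling them.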
\[ \text{plim}\;\hat\beta_{\text{FD}} = \beta \cdot \frac{\text{Var}(\Delta\text{educ}^*)}{\text{Var}(\Delta\text{educ}^*) + \text{Var}(e_{i1}) + \text{Var}(e_{i2})} \]
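A minimal simulation sketch of the two attenuation formulas, assuming NumPy and hypothetical variances: true education is highly persistent, so \(\text{Var}(\Delta\text{educ}^*)\) is small relative to the noise. The fixed effect is drawn independently of education here so that measurement error is the only source of bias.

```python
import numpy as np

rng = np.random.default_rng(42)
n, beta = 100_000, 1.0        # hypothetical sample size and true coefficient
var_e = 0.1                   # measurement-error variance in each period
var_dstar = 0.04              # variance of the true change in education

educ1_true = rng.normal(0.0, 1.0, n)                          # Var(educ*) = 1
educ2_true = educ1_true + rng.normal(0.0, np.sqrt(var_dstar), n)
ability = rng.normal(0.0, 1.0, n)                             # independent of educ* by assumption

wage1 = beta * educ1_true + ability + rng.normal(0.0, 0.5, n)
wage2 = beta * educ2_true + ability + rng.normal(0.0, 0.5, n)

# Observed education = truth + measurement error, uncorrelated over time.
educ1_obs = educ1_true + rng.normal(0.0, np.sqrt(var_e), n)
educ2_obs = educ2_true + rng.normal(0.0, np.sqrt(var_e), n)


def slope(x, y):
    """OLS slope of y on x (regression with an intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


b_cs = slope(educ1_obs, wage1)                        # cross-section, period 1
b_fd = slope(educ2_obs - educ1_obs, wage2 - wage1)    # first differences

pred_cs = beta * 1.0 / (1.0 + var_e)                      # about 0.91
pred_fd = beta * var_dstar / (var_dstar + 2 * var_e)      # about 0.17

print(f"cross-section: {b_cs:.3f} (formula {pred_cs:.3f})")
print(f"first diff.:   {b_fd:.3f} (formula {pred_fd:.3f})")
```

Under these assumed variances, differencing removes most of the true variation in education but none of the noise, so the FD estimate is attenuated far more than the cross-sectional one.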
Let \(\text{same}_{it} = \mathbb{1}[i\text{ has same-nationality roommate in } t]\)
\[ \text{grades}_{it} = \alpha + \beta\;\text{same}_{it} + e_{it} \]
\(\text{grades}_{i1} = \alpha + \beta\;\text{same}_{i1} + \alpha_i + u_{i1}\)
\(\text{grades}_{i2} = \alpha + \beta\;\text{same}_{i2} + \alpha_i + u_{i2}\)
\(\Delta\text{grades}_i = \beta\;\Delta\text{same}_i + \Delta u_i\)
Ideal model: \(\text{drug usage}_{it} = \mu\;\text{post}_t + \theta_i + \rho_t + e_{it}\)
Conditional expectations — with \(T = 4\), treatment at \(t = 3\):
\[\begin{align*} \mathbb{E}[\text{drug usage}_{it} \mid t=1] &= \theta_i + \rho_1 \\ \mathbb{E}[\text{drug usage}_{it} \mid t=2] &= \theta_i + \rho_2 \\ \mathbb{E}[\text{drug usage}_{it} \mid t=3] &= \mu + \theta_i + \rho_3 \\ \mathbb{E}[\text{drug usage}_{it} \mid t=4] &= \mu + \theta_i + \rho_4 \end{align*}\]
\[\text{drug usage}_{it} = \mu\text{post}_t + \theta_i + \gamma t + e_{it}\]
| | Parametric trend | Day FE |
|---|---|---|
| Flexibility | Low (linear) | High (any shape) |
| Estimate \(\mu\)? | Yes | No (collinear) |
| Risk | Misspecified trend | No identification |
When treatment varies only at the time level, time FE absorb it completely.
\[ \log(\text{earnings})_{it} = \alpha_i + \theta_t + \sum_{j=17}^{85} \gamma_j\;\mathbb{1}[\text{age}_{it} = j] + e_{it} \]
Each age dummy \(d_j = \mathbb{1}[\text{age}_{it} = j]\) is a binary variable with proportion \(p_j = n_j/n\):
\[ \text{Var}(\hat{\gamma}_j) \propto \frac{\sigma^2}{n \cdot p_j(1 - p_j)} \]
Nonparametric flexibility comes at the cost of imprecision where data is thin.
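To give a sense of magnitudes, the sketch below evaluates the proportionality above for a common and a rare age cell. The sample size and the two proportions are hypothetical, and `relative_variance` is an illustrative helper that returns the variance only up to the constant of proportionality.

```python
def relative_variance(p_j, n, sigma2=1.0):
    """sigma^2 / (n * p_j * (1 - p_j)): variance of gamma_j-hat up to a constant."""
    return sigma2 / (n * p_j * (1.0 - p_j))


n = 50_000                                   # hypothetical number of observations
common = relative_variance(0.03, n)          # an age held by 3% of observations
rare = relative_variance(0.001, n)           # an age held by 0.1% of observations

ratio = rare / common                        # ratio of variances
se_ratio = ratio ** 0.5                      # ratio of standard errors

print(f"variance ratio: {ratio:.1f}, standard-error ratio: {se_ratio:.1f}")
```

The sample size cancels in the ratio, so under these assumed proportions the rare age dummy has roughly 29 times the variance of the common one, whatever \(n\) is.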
A firm introduces performance pay. The panel is unbalanced: some workers leave (exiters), some stay (stayers), some join (entrants).
\[ \widehat{\log(\text{productivity})}_{it} = \hat{\alpha}_i + \hat{\beta}\;\text{performance pay}_t \]
OLS captures total change; FE isolates the within-unit mechanism. The difference is the selection channel.
\[ \text{productivity}_{it} = \beta_1\;\text{contingent}_t + \beta_2\;\text{weather}_t + \beta_3\;\text{width}_{it} + \beta_4\;\text{height}_{it} + e_{it} \]
\[\begin{align*} \text{productivity}_{it} &= \beta_1\;\text{contingent}_t + \beta_2\;\text{weather}_t + \beta_3\;\text{width}_{it} + \beta_4\;\text{height}_{it} \\ &+ \gamma_i + e_{it} \end{align*}\]
Controls handle time-varying confounders; FE handles time-invariant heterogeneity.
\[ \text{productivity}_{ijt} = \lambda_i + \mu_j + \theta_t + e_{ijt} \]
Multiple fixed effects require sufficient rotation across dimensions for identification.
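A minimal sketch of the rotation requirement, assuming NumPy and a hypothetical assignment of four workers to two managers over two periods. Time effects are left out to keep the design matrix small. The rank of the stacked worker and manager dummies shows how much can be identified.

```python
import numpy as np


def design(workers, managers, n_workers, n_managers):
    """Stack worker dummies and manager dummies side by side."""
    W = np.eye(n_workers)[workers]
    M = np.eye(n_managers)[managers]
    return np.hstack([W, M])


# Each worker is observed in two periods.
workers = np.array([0, 0, 1, 1, 2, 2, 3, 3])

# No rotation: workers 0-1 always with manager 0, workers 2-3 always with manager 1.
managers_fixed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Rotation: worker 1 moves from manager 0 to manager 1 in the second period.
managers_rotated = np.array([0, 0, 0, 1, 1, 1, 1, 1])

rank_fixed = np.linalg.matrix_rank(design(workers, managers_fixed, 4, 2))
rank_rotated = np.linalg.matrix_rank(design(workers, managers_rotated, 4, 2))

print(rank_fixed, rank_rotated)
```

Without rotation each manager dummy is the sum of that manager's worker dummies, so the manager effects add no independent columns. A single mover links the two groups, and only one normalisation is then needed.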
Bertrand & Mullainathan (2004)
Write the model for \(t = 1\) and \(t = 2\):
\[\begin{align*} y_{i1} &= \beta_0 + \beta_1 x_{i1,1} + \cdots + \beta_k x_{i1,k} + a_i + v_{i1} \\ y_{i2} &= (\beta_0 + \delta) + \beta_1 x_{i2,1} + \cdots + \beta_k x_{i2,k} + a_i + v_{i2} \end{align*}\]
Subtract: \(a_i - a_i = 0\).
\[ \Delta y_i = \delta + \beta_1\Delta x_{i1} + \beta_2\Delta x_{i2} + \cdots + \beta_k\Delta x_{ik} + \Delta v_i \]
\[ \text{plim}\;\hat\beta_{\text{CS}} = \beta \cdot \frac{\text{Var}(\text{educ}^*)}{\text{Var}(\text{educ}^*) + \text{Var}(e)} \]
\[ \text{plim}\;\hat\beta_{\text{FD}} = \beta \cdot \frac{\text{Var}(\Delta\text{educ}^*)}{\text{Var}(\Delta\text{educ}^*) + \text{Var}(e_{i1}) + \text{Var}(e_{i2})} \]
\[ \log(\text{earnings})_{it} = \alpha_i + \theta_t + \sum_{j=17}^{85} \gamma_j\;\mathbb{1}[\text{age}_{it} = j] + e_{it} \]
suffers from a fundamental identification problem:
\[ \text{age}_{it} = \text{year}_t - \text{birth year}_i \]
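The identity above can be checked numerically. The sketch assumes NumPy and a hypothetical balanced panel of five people over four years, and shows that a linear age term adds nothing to the column space spanned by individual and year dummies.

```python
import numpy as np

birth_year = np.array([1960, 1965, 1970, 1975, 1980])   # hypothetical cohorts
years = np.array([2000, 2001, 2002, 2003])              # hypothetical survey years
N, T = len(birth_year), len(years)

i_idx = np.repeat(np.arange(N), T)       # person index for each row
t_idx = np.tile(np.arange(T), N)         # year index for each row
age = years[t_idx] - birth_year[i_idx]

person = np.eye(N)[i_idx]                # individual fixed effects
year_d = np.eye(T)[t_idx][:, 1:]         # year dummies, first year omitted

X_fe = np.hstack([person, year_d])
X_all = np.column_stack([X_fe, age])

rank_fe = np.linalg.matrix_rank(X_fe)
rank_all = np.linalg.matrix_rank(X_all)

print(rank_fe, rank_all, X_all.shape[1])
```

Adding age leaves the rank unchanged, so the linear component of the age profile cannot be separated from the individual and year effects without a further restriction.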
\[\begin{align*} \hat{\beta}_{\text{OLS}} &= \bar{y}_{\text{post}} - \bar{y}_{\text{pre}} \\ &= \left[\frac{|S|}{|S|+|N|}\bar{y}^S_{\text{post}} + \frac{|N|}{|S|+|N|}\bar{y}^N_{\text{post}}\right] \\ &\quad - \left[\frac{|S|}{|S|+|X|}\bar{y}^S_{\text{pre}} + \frac{|X|}{|S|+|X|}\bar{y}^X_{\text{pre}}\right] \end{align*}\]
\[ \hat{\beta}_{\text{FE}} = \frac{1}{|S|}\sum_{i \in S}(y_{i,\text{post}} - y_{i,\text{pre}}) \]
\(|S|\), \(|X|\), \(|N|\) = number of stayers, exiters, entrants.
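A small numeric sketch of the two estimators, assuming NumPy and made-up outcome values for two stayers, two exiters and two entrants. It checks the weighted-average decomposition against the simple difference in period means.

```python
import numpy as np

# Hypothetical outcomes (for example log productivity scaled for readability).
stay_pre = np.array([10.0, 12.0])
stay_post = np.array([11.0, 13.0])
exit_pre = np.array([6.0, 8.0])       # exiters are observed only before
entr_post = np.array([14.0, 16.0])    # entrants are observed only after

n_S, n_X, n_N = len(stay_pre), len(exit_pre), len(entr_post)

# OLS on a post dummy: difference in period means over everyone observed.
beta_ols = (np.concatenate([stay_post, entr_post]).mean()
            - np.concatenate([stay_pre, exit_pre]).mean())

# The same number written as weighted group means.
post_mean = (n_S * stay_post.mean() + n_N * entr_post.mean()) / (n_S + n_N)
pre_mean = (n_S * stay_pre.mean() + n_X * exit_pre.mean()) / (n_S + n_X)
beta_ols_weighted = post_mean - pre_mean

# FE: average within-worker change among stayers.
beta_fe = (stay_post - stay_pre).mean()

selection = beta_ols - beta_fe

print(beta_ols, beta_fe, selection)
```

In this made-up example low-productivity workers leave and high-productivity workers join, so most of the OLS change comes from composition and not from within-worker improvement.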
The call centre exercise (Q12) builds on Abowd et al. (1999) and Fenizia (2022).
\[ \log(\text{wages})_{it} = \alpha_i + \psi_{J(i,t)} + x'_{it}\beta + e_{it} \]
Treatment starts on day 3:
| Day | \(d_2\) | \(d_3\) | \(d_4\) | \(\text{post}_t\) |
|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 0 | 1 | 0 | 1 |
| 4 | 0 | 0 | 1 | 1 |
\(\text{post}_t = d_3 + d_4\) is an exact linear combination of the day dummies, so the regressors are perfectly collinear. Software such as Stata drops one of the collinear variables automatically, and the treatment effect \(\mu\) cannot be separated from the day effects.
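The collinearity in the table can be verified with a rank check. The sketch assumes NumPy and three hypothetical units observed on each of the four days. Individual fixed effects are omitted because they do not affect the dependence between \(\text{post}_t\) and the day dummies.

```python
import numpy as np

n_units = 3                                   # hypothetical number of individuals
day = np.tile(np.arange(1, 5), n_units)       # days 1..4 for each unit

const = np.ones(len(day))
d2, d3, d4 = [(day == k).astype(float) for k in (2, 3, 4)]
post = (day >= 3).astype(float)               # treatment starts on day 3

# Day fixed effects plus post: five columns.
X_day_fe = np.column_stack([const, d2, d3, d4, post])
rank_day_fe = np.linalg.matrix_rank(X_day_fe)

# Linear trend plus post: three columns.
X_trend = np.column_stack([const, day.astype(float), post])
rank_trend = np.linalg.matrix_rank(X_trend)

print(rank_day_fe, X_day_fe.shape[1], rank_trend, X_trend.shape[1])
```

The day-FE design has five columns but rank four, while the linear-trend design has full column rank. This is the trade-off between flexibility and identification of \(\mu\).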